6 research outputs found

    Improving Representation Learning for Deep Clustering and Few-shot Learning

    The amounts of data in the world have increased dramatically in recent years, and it is quickly becoming infeasible for humans to label all of it. It is therefore crucial that modern machine learning systems can operate with few or no labels. The introduction of deep learning and deep neural networks has led to impressive advancements in several areas of machine learning, largely due to the unprecedented ability of deep neural networks to learn powerful representations from a wide range of complex input signals. This ability is especially important when labeled data is limited, as the absence of a strong supervisory signal forces models to rely more on intrinsic properties of the data and its representations.

    This thesis focuses on two key concepts in deep learning with few or no labels. First, we aim to improve representation quality in deep clustering - both for single-view and multi-view data. Current models for deep clustering struggle to properly represent semantic similarities, which is crucial for discovering meaningful clusterings. This is especially challenging with multi-view data, since the information required for successful clustering might be scattered across many views. Second, we focus on few-shot learning, and on how geometrical properties of representations influence few-shot classification performance. We find that a large number of recent methods for few-shot learning embed representations on the hypersphere, and we therefore seek to understand what makes the hypersphere a particularly suitable embedding space for few-shot learning.

    Our work on single-view deep clustering addresses the susceptibility of deep clustering models to trivial solutions with non-meaningful representations. To address this issue, we present a new auxiliary objective that - compared to the popular autoencoder-based approach - better aligns with the main clustering objective, resulting in improved clustering performance. Similarly, our work on multi-view clustering focuses on how representations can be learned from multi-view data so that they are suitable for the clustering objective. Where recent methods for deep multi-view clustering have focused on aligning view-specific representations, we find that this alignment procedure might actually be detrimental to representation quality. We investigate the effects of representation alignment and provide novel insights on when alignment is beneficial and when it is not. Based on our findings, we present several new methods for deep multi-view clustering - both alignment-based and non-alignment-based - that outperform current state-of-the-art methods.

    Our first work on few-shot learning aims to tackle the hubness problem, which has been shown to degrade few-shot classification performance. To this end, we present two new methods to embed representations on the hypersphere for few-shot learning. Further, we provide both theoretical and experimental evidence indicating that embedding representations as uniformly as possible on the hypersphere reduces hubness and improves classification accuracy. Building on these findings on hyperspherical embeddings, we seek to improve the understanding of representation norms. In particular, we ask what type of information the norm carries, and why it is often beneficial to discard the norm in classification models. We answer this question by presenting a novel hypothesis on the relationship between the representation norm and the number of objects of a certain class in the image. We then analyze our hypothesis both theoretically and experimentally, presenting promising results that corroborate it.
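    The hubness and hypersphere ideas above can be illustrated with a small sketch: hubness is commonly measured as the skewness of the k-occurrence distribution (how often each point shows up among other points' nearest neighbours), and hyperspherical embedding amounts to L2-normalizing representations. The data, the value of k, and the skewness measure below are generic illustrative choices, not the thesis's actual methods or experiments.

```python
import numpy as np

def hubness_skewness(X, k=5):
    """Skewness of the k-occurrence distribution (how often each point
    appears among the k nearest neighbours of the others). Large positive
    skew indicates the hubness problem discussed above."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)              # exclude self-neighbours
    nn = np.argsort(d, axis=1)[:, :k]        # k nearest neighbours per point
    counts = np.bincount(nn.ravel(), minlength=n)
    return float(((counts - counts.mean()) ** 3).mean() / counts.std() ** 3)

def to_hypersphere(X):
    """Project representations onto the unit hypersphere (L2-normalize)."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 16))               # made-up representations
S = to_hypersphere(Z)
skew = hubness_skewness(S, k=5)
```

    Comparing the skewness before and after normalization (and under more or less uniform point distributions on the sphere) is the kind of experiment the uniformity argument above refers to.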

    Reducing Objective Function Mismatch in Deep Clustering with the Unsupervised Companion Objective

    Preservation of local similarity structure is a key challenge in deep clustering. Many recent deep clustering methods therefore use autoencoders to help guide the model's neural network towards an embedding which is more reflective of the input space geometry. However, recent work has shown that autoencoder-based deep clustering models can suffer from objective function mismatch (OFM). In order to improve the preservation of local similarity structure while simultaneously keeping OFM low, we develop a new auxiliary objective function for deep clustering. Our Unsupervised Companion Objective (UCO) encourages a consistent clustering structure at intermediate layers in the network -- helping the network learn an embedding which is more reflective of the similarity structure in the input space. Since a clustering-based auxiliary objective has the same goal as the main clustering objective, it is less prone to introducing objective function mismatch between itself and the main objective. Our experiments show that attaching the UCO to a deep clustering model improves the model's performance and results in a lower OFM, compared to an analogous autoencoder-based model.
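    The idea of a clustering-based auxiliary objective at intermediate layers can be sketched as follows: compute soft cluster assignments at each intermediate layer and penalize disagreement with the final clustering. The Student-t assignment and cross-entropy below are generic DEC-style stand-ins chosen for illustration, not the actual UCO loss.

```python
import numpy as np

def soft_assign(H, centroids, alpha=1.0):
    """Student-t soft cluster assignments (generic stand-in)."""
    d2 = ((H[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def companion_loss(intermediate, final_q, eps=1e-9):
    """Average cross-entropy between each intermediate layer's soft
    clustering and the final clustering: low when all layers agree."""
    return float(np.mean([-(final_q * np.log(q + eps)).sum(1).mean()
                          for q in intermediate]))

rng = np.random.default_rng(0)
H1, H2, Hout = (rng.normal(size=(50, 8)) for _ in range(3))  # layer outputs
C = rng.normal(size=(4, 8))                                  # 4 centroids
final_q = soft_assign(Hout, C)
loss = companion_loss([soft_assign(H1, C), soft_assign(H2, C)], final_q)
```

    Adding such a term to the main clustering loss pulls the intermediate embeddings towards the same cluster structure, which is the consistency the abstract describes.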

    Kartlegging og overvåking av reinbeiter [Mapping and monitoring of reindeer pastures]

    This report describes work on satellite-based mapping of winter pastures in inner Finnmark. The project constitutes a new update of the reindeer pastures on Finnmarksvidda and is part of the "Overvåkingsprogrammet for Indre Finnmark" (Monitoring Programme for Inner Finnmark). This programme was initiated in 1998, with updates in 2005/2006, 2009/2010, 2013, and 2018. The monitoring programme was originally divided into two sub-tasks: one for the registration of ground data, and one for mapping and area calculation based on satellite data.

    Deep Image Clustering with Tensor Kernels and Unsupervised Companion Objectives

    Deep image clustering is a rapidly growing branch of machine learning and computer vision, in which deep neural networks are trained to discover groups within a set of images in an unsupervised manner. Deep neural networks have proven to be immensely successful in several machine learning tasks, but the majority of these advances have been in supervised settings. The process of labeling data for supervised applications can be extremely time-consuming, or even completely infeasible in many domains. This has led researchers to shift their focus towards the deep clustering field. However, this field is still in its infancy, with several open research questions regarding e.g. the design and optimization of the algorithms, the discovery of meaningful clusters, and the initialization of model parameters.

    In an attempt to address some of these open questions, a new algorithm for deep image clustering is developed in this thesis. The proposed Deep Tensor Kernel Clustering (DTKC) consists of a convolutional neural network (CNN) which is trained to reflect a common cluster structure at the output of all its intermediate layers. Encouraging a consistent cluster structure throughout the network has the potential to guide it towards meaningful clusters, even though these clusters might appear to be nonlinear in the input space. The cluster structure is enforced through the idea of companion objectives, where separate loss functions are attached to each of the layers in the network. These companion objectives are constructed based on a proposed generalization of the Cauchy-Schwarz (CS) divergence from vectors to tensors of arbitrary rank. Generalizing the CS divergence to tensor-valued data is a crucial step, due to the tensorial nature of the intermediate representations in the CNN. Furthermore, an alternative initialization strategy based on self-supervised learning is also employed. To the author's best knowledge, this is the first attempt at using this particular self-supervised learning approach to initialize a deep clustering algorithm. Several experiments are conducted to thoroughly assess the performance of the proposed DTKC model, with and without self-supervised pre-training. The results show that the models outperform, or perform comparably to, a wide range of benchmark algorithms from the literature.
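    The Cauchy-Schwarz divergence underlying the companion objectives can be estimated from samples with a kernel. The sketch below shows only the classical vector-valued case with a Gaussian kernel; the thesis's contribution is the generalization to tensors of arbitrary rank, which is not reproduced here, and the bandwidth and sample sizes are arbitrary illustrative choices.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian kernel matrix between two sets of vectors."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def cs_divergence(X, Y, sigma=1.0):
    """Empirical Cauchy-Schwarz divergence between two samples:
    minus the log of the normalized mean cross-kernel. It is zero
    when the two samples coincide and grows as they separate."""
    kxy = gaussian_kernel(X, Y, sigma).mean()
    kxx = gaussian_kernel(X, X, sigma).mean()
    kyy = gaussian_kernel(Y, Y, sigma).mean()
    return float(-np.log(kxy / np.sqrt(kxx * kyy)))

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 4))
B = rng.normal(size=(40, 4)) + 5.0           # well-separated second sample
```

    In a clustering loss, maximizing such a divergence between the points assigned to different clusters pushes the clusters apart in the representation space.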

    Unsupervised Feature Extraction – A CNN-Based Approach

    Working with large quantities of digital images can often lead to prohibitive computational challenges due to their massive number of pixels and high dimensionality. The extraction of compressed vectorial representations from images is therefore a task of vital importance in the field of computer vision. In this paper, we propose a new convolutional-neural-network-based architecture for extracting such features from images in an unsupervised manner. The model, referred to as the Unsupervised Convolutional Siamese Network (UCSN), is trained to embed a set of images in a vector space such that local distance structure in the space of images is approximately preserved. We compare the UCSN to several classical methods by using the extracted features as input to a classification system. Our results indicate that the UCSN produces vectorial representations that are suitable for classification purposes.
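    The objective of approximately preserving local distance structure can be sketched as a loss over each point's nearest input-space neighbours. The squared-error form, the stand-in "images", and the crude projection "embedding" below are illustrative assumptions, not the UCSN architecture or its training loss.

```python
import numpy as np

def local_distance_loss(X, Z, k=5):
    """Mean squared mismatch between input-space and embedding-space
    distances, taken over each point's k nearest input-space neighbours."""
    n = X.shape[0]
    dx = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    dz = np.linalg.norm(Z[:, None] - Z[None, :], axis=-1)
    np.fill_diagonal(dx, np.inf)             # exclude self from neighbours
    nn = np.argsort(dx, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    return float(((dx[rows, cols] - dz[rows, cols]) ** 2).mean())

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))                # stand-in "images"
Z = X[:, :4]                                 # crude projection "embedding"
loss = local_distance_loss(X, Z, k=5)
```

    An encoder trained to drive such a loss towards zero maps nearby images to nearby vectors, which is the property the classification experiments above rely on.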

    RELAX: Representation Learning Explainability

    Despite the significant improvements that self-supervised representation learning has led to when learning from unlabeled data, no methods have been developed that explain what influences the learned representation. We address this need through our proposed approach, RELAX, which is the first approach for attribution-based explanations of representations. Our approach can also model the uncertainty in its explanations, which is essential for producing trustworthy explanations. RELAX explains representations by measuring similarities in the representation space between an input and masked-out versions of itself, providing intuitive explanations that significantly outperform the gradient-based baselines. We provide theoretical interpretations of RELAX and conduct a novel analysis of feature extractors trained using supervised and unsupervised learning, providing insights into different learning strategies. Moreover, we conduct a user study to assess how well the proposed approach aligns with human intuition, and show that the proposed method outperforms the baselines in both the quantitative and human evaluation studies. Finally, we illustrate the usability of RELAX in several use cases and highlight that incorporating uncertainty can be essential for providing faithful explanations, taking a crucial step towards explaining representations.
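    The masking-based attribution idea can be sketched as follows: a pixel's importance is the average similarity between the representation of the full image and the representations of random masked versions in which that pixel was kept. The random binary masks, cosine similarity, and toy encoder below are assumptions for illustration; RELAX itself also models the uncertainty of the explanations, which is omitted here.

```python
import numpy as np

def relax_sketch(image, encoder, n_masks=200, p=0.5, seed=0):
    """Attribution sketch in the spirit of RELAX: average the cosine
    similarity between the full image's representation and those of
    masked versions, over the masks in which each pixel was kept."""
    rng = np.random.default_rng(seed)
    h = encoder(image)
    h = h / (np.linalg.norm(h) + 1e-9)
    R = np.zeros(image.shape)                # accumulated importance
    W = np.zeros(image.shape)                # times each pixel was kept
    for _ in range(n_masks):
        m = (rng.random(image.shape) < p).astype(float)
        hm = encoder(image * m)
        hm = hm / (np.linalg.norm(hm) + 1e-9)
        s = float(h @ hm)                    # representation similarity
        R += s * m
        W += m
    return R / np.maximum(W, 1.0)

image = np.arange(64, dtype=float).reshape(8, 8) / 64.0
enc = lambda x: x.sum(axis=0)                # toy "encoder": column sums
attr = relax_sketch(image, enc)
```

    Pixels whose removal consistently changes the representation receive low average similarity and hence low importance; with a real feature extractor in place of the toy encoder, the same loop yields the attribution maps described above.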